Evading Anomaly Detection through Variance Injection Attacks on PCA
Authors
Abstract
Whenever machine learning is applied to security problems, it is important to measure vulnerabilities to adversaries who poison the training data. We demonstrate the impact of variance injection schemes on PCA-based network-wide volume anomaly detectors, when a single compromised PoP injects chaff into the network. For DoS attacks, these schemes can increase the chance of evading detection sixfold.

1 Motivation and Problem Statement

We are broadly interested in understanding vulnerabilities associated with using machine learning in decision-making, specifically how adversaries with even limited information and control over the learner can subvert the decision-making process [1]. An important example is the role played by machine learning in dynamic network anomography, the problem of inferring network-level Origin-Destination (OD) flow anomalies from aggregate network measurements. We ask: can an adversary generate OD flow traffic that misleads network anomography techniques into misclassifying anomalous flows? We show the answer is yes for a popular technique based on Principal Components Analysis (PCA) from [2].

The detector operates on the T×N link traffic matrix Y, formed by measuring N link volumes between PoPs in a backbone network over T time intervals. Figure 1 depicts an example OD flow within a PoP-to-PoP topology. Lakhina et al. observed that the rows of the normal traffic in Y lie close to a low-dimensional subspace captured by PCA using K = 4 principal components. Their detection method projects the traffic onto this normal K-dimensional subspace and examines the residual: residuals that are large compared with the Q-statistic are flagged as anomalies (positives), and small residuals are treated as normal traffic (negatives).

2 Results and Future Work

Consider an adversary launching a DoS attack on flow f in week w. Poisoning aims to rotate PCA’s K-dimensional subspace so that a false negative (FN) occurs during the attack. Our Week-Long schemes achieve this goal by adding high-variance chaff at the compromised origin PoP, along f, throughout week w − 1.

R. Lippmann, E. Kirda, and A. Trachtenberg (Eds.): RAID 2008, LNCS 5230, pp. 394–395, 2008. © Springer-Verlag Berlin Heidelberg 2008

Fig. 1. Point-of-presence (PoP)-level granularity in a backbone network, and the links used for data poisoning

Fig. 2. Week-Long attacks: test FNRs are plotted against the relative increase to the mean link volumes for the attacked flow. [Plot "FNR vs. Relative Link Traffic Increase"; x-axis: relative mean link traffic volume after attack; y-axis: average test FN rate; series: Add-More-If-Bigger, Scaled Bernoulli]

Figure 2 presents results for two chaff methods. Both methods add chaff c_t to each link in f at time t, depending on parameter θ: Scaled Bernoulli selects c_t uniformly from {0, θ}; Add-More-If-Bigger adds c_t = (y_o(t) − ȳ_o), where y_o(t) and ȳ_o are the week w − 1 origin link traffic at time t and the average origin link traffic, respectively. We evaluated these methods on data from Abilene’s backbone network of 12 PoPs. For each week, 2016 measurements were taken, each averaged over 5-minute intervals, for each of 54 virtual links: 15 bi-directional inter-PoP links and the PoPs’ ingress and egress links.

The attacker’s chance of evasion is measured by the FN rate (FNR). We see that the Add-More-If-Bigger chaff method, which exploits information about origin link traffic, achieves greater FNR increases than Scaled Bernoulli. The baseline FNR of 4% for PCA on clean data can be doubled by adding on average only 4% additional traffic to the links along the poisoned flow. The FNR can be increased sixfold to 24% via an average increase of 10% to the poisoned link traffic.
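The two chaff methods described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the synthetic traffic, and the link indices in `flow_links` are hypothetical. The text as extracted does not show where θ enters the Add-More-If-Bigger formula, so the sketch assumes θ is an exponent applied to the positive part of the deviation from the mean; treat that as an assumption.

```python
import numpy as np


def scaled_bernoulli_chaff(num_steps, theta, rng):
    """Scaled Bernoulli: draw c_t uniformly from {0, theta} at each time step."""
    return theta * rng.integers(0, 2, size=num_steps).astype(float)


def add_more_if_bigger_chaff(origin_link, theta):
    """Add-More-If-Bigger: more chaff when origin traffic exceeds its mean.

    Assumption: theta is an exponent on the positive part of the deviation;
    the source formula does not show where theta enters.
    """
    origin_link = np.asarray(origin_link, dtype=float)
    deviation = origin_link - origin_link.mean()
    return np.maximum(deviation, 0.0) ** theta


def poison_week(Y, flow_links, chaff):
    """Return a copy of Y (T x N) with chaff added to every link along the flow."""
    poisoned = np.array(Y, dtype=float, copy=True)
    poisoned[:, flow_links] += chaff[:, None]
    return poisoned


# Synthetic stand-in for one week of link volumes (not Abilene data).
rng = np.random.default_rng(0)
Y_clean = rng.gamma(shape=5.0, scale=20.0, size=(2016, 54))
flow_links = [0, 7, 12]  # hypothetical link indices along flow f
chaff = add_more_if_bigger_chaff(Y_clean[:, flow_links[0]], theta=1.0)
Y_poisoned = poison_week(Y_clean, flow_links, chaff)

# Relative mean volume on the attacked links, the x-axis quantity of Fig. 2.
relative_volume = Y_poisoned[:, flow_links].mean() / Y_clean[:, flow_links].mean()
print(f"relative mean link volume after poisoning: {relative_volume:.3f}")
```

Under these assumptions, Scaled Bernoulli needs no knowledge of the traffic, while Add-More-If-Bigger concentrates chaff in the time bins where origin traffic is already above average, which is the extra information the text credits for its larger effect.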
In our Boiling Frog strategies, where poisoning is increased slowly over several weeks, a 50% chance of successful evasion can be achieved with a modest 5% volume increase from week to week over a 3-week period [3]. We have verified that simply increasing the number of principal components does not protect against our attacks [3]. In future work we will evaluate countermeasures based on robust formulations of PCA, and will devise poisoning strategies for increasing PCA’s false positive rate.
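For reference, the subspace detector targeted by these attacks, with its number of principal components K, can be sketched as follows. This is a simplified illustration, not the implementation from [2]: the function names are hypothetical, and the threshold is an empirical quantile of the training residuals used as a stand-in for the Q-statistic.

```python
import numpy as np


def squared_residuals(Y, mean, components):
    """Squared norm of each row after removing its normal-subspace projection."""
    centered = np.asarray(Y, dtype=float) - mean
    projected = centered @ components.T @ components
    return np.sum((centered - projected) ** 2, axis=1)


def fit_subspace_detector(Y_train, k=4, quantile=0.999):
    """Fit the K-dimensional normal subspace and a residual threshold.

    Assumption: the threshold is an empirical quantile of the training
    residuals, standing in for the Q-statistic named in the text.
    """
    Y_train = np.asarray(Y_train, dtype=float)
    mean = Y_train.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(Y_train - mean, full_matrices=False)
    components = vt[:k]
    residuals = squared_residuals(Y_train, mean, components)
    threshold = np.quantile(residuals, quantile)
    return mean, components, threshold


def detect(Y, mean, components, threshold):
    """Flag time bins (rows) whose squared residual exceeds the threshold."""
    return squared_residuals(Y, mean, components) > threshold
```

Because the subspace is fitted on the previous week's traffic, chaff that adds variance along the attacked flow pulls a principal direction toward that flow, which shrinks the residual of a later attack on it.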